Applications of the streamed storage format for sparse matrix operations

Authors

  • Dahai Guo
  • William Gropp
Abstract

The streamed storage format for sparse matrices has shown good performance improvements for sparse matrix-vector multiplication (SpMV) compared with the compressed sparse row (CSR) and block CSR (BCSR) formats, particularly on IBM Power processors. We extend the format to exploit single instruction multiple data (SIMD) instructions in order to utilize the vector unit, and discuss how the streamed formats perform on the Power7 processor, which is the first eight-core chip from IBM. The streamed format is then applied to two more sparse matrix operations: successive over-relaxation (SOR) iteration sweeps and incomplete lower and upper (ILU) triangular solvers. Basic solvers are developed for them in the high-performance computing (HPC) package PETSc. Test results on the IBM Power7 processor show that the SIMD instructions improve the performance of the streamed storage format on SpMV. The format also accelerates SOR iteration sweeps and ILU matrix solvers, compared with the traditional BCSR format used in PETSc.
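
For orientation, the two kernels named in the abstract can be sketched on the plain CSR baseline. This is a minimal illustration, not the streamed format (whose layout is not described on this page) and not PETSc code. The function names `csr_spmv` and `sor_sweep`, the array names `row_ptr`, `col_idx`, `values`, and the 3x3 test matrix are all assumptions made for the example.

```python
# Illustrative CSR kernels; names and data are hypothetical, not from the paper.

def csr_spmv(row_ptr, col_idx, values, x):
    """Return y = A * x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def sor_sweep(row_ptr, col_idx, values, b, x, omega=1.0):
    """Apply one forward SOR sweep to A x = b, updating x in place.

    Assumes every row stores a nonzero diagonal entry.
    omega = 1.0 reduces the sweep to Gauss-Seidel.
    """
    for i in range(len(row_ptr) - 1):
        diag = 0.0
        sigma = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j == i:
                diag = values[k]
            else:
                # Uses already-updated x[j] for j < i, old x[j] for j > i.
                sigma += values[k] * x[j]
        x[i] = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / diag
    return x


# A = [[4, 1, 0],
#      [1, 4, 1],
#      [0, 1, 4]]
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [4.0, 1.0, 1.0, 4.0, 1.0, 1.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, [1.0, 2.0, 3.0])

# Repeated sweeps on A x = y should recover x = [1, 2, 3].
x = [0.0, 0.0, 0.0]
for _ in range(50):
    sor_sweep(row_ptr, col_idx, values, y, x)
```

Both loops read `col_idx` to gather entries of `x` indirectly, which is the irregular access pattern that alternative storage formats and SIMD variants aim to make cheaper.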


Related resources

Sparse Matrix Storage Format

Operations on Sparse Matrices are the key computational kernels in many scientific and engineering applications. They are characterized with poor substantiated performance. It is not uncommon for microprocessors to gain only 10-20% of their peak floating-point performance when doing sparse matrix computations even when special vector processors have been added as coprocessor facilities. In this...


A Hierarchical Sparse Matrix Storage Format for Vector Processors

We describe and evaluate a Hierarchical Sparse Matrix (HiSM) storage format designed to be a unified format for sparse matrix applications on vector processors. The advantages that the format offers are low storage requirements, a flexible structure for element manipulations and allowing for efficient operations. To take full advantage of the format we also propose a vector architecture extensi...


Data Structures and Algorithms for Distributed Sparse Matrix Operations

We propose extensions of the classical row compressed storage format for sparse matrices. The extensions are designed to accommodate distributed storage of the matrix. We outline an implementation of the matrix-vector product using this distributed storage format, and give algorithms for building and using the communication structure between processors.


Sparse Matrix-Vector Multiplication on FPGAs

Floating-point Sparse Matrix-Vector Multiplication (SpMXV) is a key computational kernel in scientific and engineering applications. The poor data locality of sparse matrices significantly reduces the performance of SpMXV on general-purpose processors, which rely heavily on the cache hierarchy to achieve high performance. The abundant hardware resources on current FPGAs provide new opportunities to...


Transposition Mechanism for Sparse Matrices on Vector Processors

Many scientific applications involve operations on sparse matrices. However, due to irregularities induced by the sparsity patterns, many operations on sparse matrices execute inefficiently on traditional scalar and vector architectures. To tackle this problem a scheme has been proposed consisting of two parts: (a) An extension to a vector architecture to support sparse matrix-vector multiplica...



Journal title:
  • IJHPCA

Volume 28

Publication date: 2014